New fuzzer mode: Fuzz against JavaScript#8655
Conversation
| if (!lub.noted()) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Yes, but who knows what might end up happening in the engine 🤷 Might as well exercise as many situations as we can. Without looking at the V8 code, it seems unlikely but plausible that there would be some different code path taken on traps or exceptions, or when doing a JSPI suspension or something, depending on the return type even when the function never does a normal return.
There was a problem hiding this comment.
Yeah, I guess if something unrelated changes the output we would need to entirely regenerate the test.
I still wish there were a more declarative way of doing this kind of property testing on our fuzzers, though. Can we at least refactor this into a generic utility for running the fuzzer iteratively and checking that the output satisfies various user-provided predicates with given probabilities? Laying this into reasonable abstractions would help make it more palatable.
| if (newHeapType != new_.getHeapType() || newHeapType.isBasic()) { | ||
| newExactness = Inexact; | ||
| } |
There was a problem hiding this comment.
Might as well check these conditions before burning a bit to generate a new exactness above.
| if (!lub.noted()) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
A random subtype of the old result type, yes. The maybeRefine call below should not need to change.
| if (newHeapType != new_.getHeapType() || newHeapType.isBasic()) { | ||
| newExactness = Inexact; | ||
| } |
|
|
||
| # Given the types we saw for params or results, look in detail for the | ||
| # things we expect to see. | ||
| def found_expected(self, data): |
There was a problem hiding this comment.
IIUC, this looks for a single iteration where all of these things are true. But that's stricter than we want. Can we make each of these conditions a bool on the class, set them to true whenever we see them, and pass the test once all have been seen?
There was a problem hiding this comment.
This test's target properties all increase monotonically anyhow: as more things are seen, we only ever add them, making passing more likely.
| // We receive the new type, computed as a LUBFinder, and the index in that | ||
| // LUB. | ||
| auto maybeRefine = [&](Type old, LUBFinder newLUB, Index lubIndex) { |
There was a problem hiding this comment.
Why not keep passing the new type in as a Type? The only interesting thing we do with newLUB below is check newLUB.noted(), but shouldn't we be able to just check whether the new type (i.e. the LUB) is Type::unreachable?
There was a problem hiding this comment.
Either could work, but noted() is a more explicit way to check whether we saw any type there, I think?
There was a problem hiding this comment.
I don't think it's worth passing all this extra external state into the lambda, though.
There was a problem hiding this comment.
LUBFinder is just a type + logic to track whether we noted, so it seems suitable?
Also note that the LUB tracks all the indexes at once (they are all unreachable or not), hence we pass in the index after it. This avoids each caller needing to check the unreachable case and indexing only if not.
There was a problem hiding this comment.
It just seems like a simpler abstraction to me if maybeRefine just takes the new and old types, but I don't want to block landing on this.
There was a problem hiding this comment.
I refactored in the last commit to make that method just use types, but there is an external maybeRefineIndex that takes the entire types and parses unreachability (that avoids the duplicated code that I want to prevent here).
| if (!old.isRef()) { | ||
| return old; | ||
| } | ||
| return maybeRefine(old, Type(old.getHeapType().getBottom(), NonNullable)); |
There was a problem hiding this comment.
I think this could all be replaced with the following:
auto lub = newLUB.noted() ? newLUB.getLUB()[index] : Type(Type::unreachable);
return maybeRefine(oldTypes[index], lub);
if we also added this to maybeRefine:
if (new_ == Type::unreachable) {
new_ = Type(oldHeapType.getBottom(), NonNullable);
}
But I see that the noted and non-noted cases requiring different indexing is the source of the extra work. We could fix that by allowing arbitrary indexing into Type::unreachable (and having it return Type::unreachable at every index), but it's probably not worth making that change to simplify just this code.
Whatever way of structuring this you think is best SGTM.
There was a problem hiding this comment.
Ah yes, handling unreachable in maybeRefine is a little simpler, thanks. Done.
The new fuzzer flag
--fuzz-against-jstells the fuzzer we will only run thewasm against JS - not link it to wasm or something else. This lets it make
changes that are valid from JS's point of view, like refining things on the
boundary while not changing the arity.
For example, if we sent JS an anyref, but the actual type we send is
(ref $A)then we can refine to that type (or any type between it andanyref). We can do this for both export results and import params, as in
both cases we send things to JS and know their type.
Original idea was @tlively 's. This is useful for fuzzers that generate JS
and let Binaryen mutate the wasm: they can emit anyrefs on the
boundary, and Binaryen will be able to add new GC types in the
module and even refine the boundary to those types. Such a fuzzer
does not even need to emit GC types itself (it can emit anyref and send
only nulls). cc @rmahdav